Abstract

A strong direct product theorem for a problem in a given model of computation states that, in
order to compute $k$ instances of the problem, if we provide resource which is less than $k$ times the
resource required for computing one instance of the problem with constant success probability,
then the probability of correctly computing all the $k$ instances together is exponentially small
in $k$. In this paper, we consider the model of two-party bounded-round public-coin randomized
communication complexity. For a relation $f \subseteq \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$
($\mathcal{X}, \mathcal{Y}, \mathcal{Z}$ are finite sets), let $\mathrm{R}^{(t),\mathrm{pub}}_{\varepsilon}(f)$
denote the two-party $t$-message public-coin communication complexity of $f$ with worst case error
$\varepsilon$. We show that for any relation $f$ and integer $k \geq 1$

\[
\mathrm{R}^{(t),\mathrm{pub}}_{1 - 2^{-\Omega(k/t^2)}}\left(f^k\right)
= \Omega\left(\frac{k}{t} \cdot \left(\mathrm{R}^{(t),\mathrm{pub}}_{1/3}(f) - O(t^2)\right)\right).
\]

In particular, it implies a strong direct product theorem for the two-party constant-message
public-coin randomized communication complexity of all relations $f$.

Our result for example implies a strong direct product theorem for the pointer chasing
problem. This problem has been well studied for understanding round versus communication
trade-offs in both classical and quantum communication protocols [NW91, Kla00, PRV01, KNTSZ01,
JRS02].

We show our result using information theoretic arguments. Our arguments and techniques
build on the ones used in Jain [Jai11], where a strong direct product theorem for the two-party
one-way public-coin communication complexity of all relations is shown (that is the special case
of our result when $t = 1$). One key tool used in our work and also in Jain [Jai11] is a message
compression technique due to Braverman and Rao [BR11], who used it to show a direct sum
theorem for the two-party bounded-round public-coin randomized communication complexity
of all relations. Another important tool that we use is a correlated sampling protocol, which,
for example, has been used in Holenstein [Hol07] for proving a parallel repetition theorem for
two-prover games.
1 Introduction

A fundamental question in complexity theory is how much resource is needed to solve $k$
independent instances of a problem compared to the resource required to solve one instance. More
specifically, suppose for solving one instance of a problem with probability of correctness $p$, we
require $c$ units of some resource in a given model of computation. A natural way to solve $k$
independent instances of the same problem is to solve them independently, which needs $k \cdot c$
units of resource and the overall success probability is $p^k$. A strong direct product theorem
for this problem would state that any algorithm, which solves $k$ independent instances of this
problem with $o(k \cdot c)$ units of the resource, can only compute all the $k$ instances correctly with
probability at most $p^{\Omega(k)}$.
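
To make the baseline concrete, the following minimal Python sketch computes the resource and the
overall success probability of the naive strategy described above (solving each instance
independently). The function name `naive_repetition` and the sample numbers are illustrative
assumptions and do not come from the paper.

```python
from typing import Tuple


def naive_repetition(p: float, c: float, k: int) -> Tuple[float, float]:
    """Solve k instances independently, each with success p and cost c.

    Returns (total resource, probability that all k instances are correct).
    """
    total_resource = k * c      # k independent runs, c units each
    overall_success = p ** k    # independent runs must all succeed
    return total_resource, overall_success


if __name__ == "__main__":
    # Illustrative numbers only: p = 2/3, c = 10 units, k = 20 instances.
    resource, success = naive_repetition(2 / 3, 10.0, 20)
    print(f"resource = {resource}, success = {success:.6f}")
```

A strong direct product theorem says that, up to constants in the exponent, no algorithm with
noticeably less than this total resource can beat this exponentially decaying success probability.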

In this work, we are concerned with the model of communication complexity which was
introduced by Yao [Yao79]. In this model there are different parties who wish to compute a
joint relation of their inputs. They do local computation, use public/private coins, and
communicate between them to achieve this task. The resource that is counted is the number of
bits communicated. The text by Kushilevitz and Nisan [KN96] is an excellent reference for
this model. Direct product questions and the weaker direct sum questions have been
extensively investigated in different sub-models of communication complexity. A direct sum theorem
states that in order to compute $k$ independent instances of a problem, if we provide resources
less than $k$ times the resource required to compute one instance of the problem with constant
success probability $p < 1$, then the success probability for computing all the $k$ instances
correctly is at most a constant $q < 1$. Some examples of known direct product theorems
are: Parnafes, Raz and Wigderson's [PRW97] theorem for forests of communication protocols;
Shaltiel's [Sha04] theorem for the discrepancy bound (which is a lower bound on the distributional
communication complexity) under the uniform distribution; extended to arbitrary distributions
by Lee, Shraibman and Špalek [LSŠ08]; extended to the multiparty case by Viola and
Wigderson [VW08]; extended to the generalized discrepancy bound by Sherstov [She11]; Jain, Klauck
and Nayak's [JKN08] theorem for subdistribution bound; Klauck, Špalek and de Wolf's [KŠdW04]
theorem for the quantum communication complexity of the set disjointness problem; Klauck's
[Kla10] theorem for the public-coin communication complexity of the set-disjointness problem
(which was re-proven using very different arguments in Jain [Jai11]); Ben-Aroya, Regev, and de
Wolf's [BARdW08] theorem for the one-way quantum communication complexity of the index
function problem; Jain's [Jai11] theorem for randomized one-way communication complexity
and Jain's [Jai11] theorem for conditional relative min-entropy bound (which is a lower bound
on the public-coin communication complexity). Direct sum theorems have been shown in the
public-coin one-way model [JRS03a], public-coin simultaneous message passing model [JRS03a],
entanglement-assisted quantum one-way communication model [JRS05], private-coin
simultaneous message passing model [JK09] and constant-round public-coin two-way model [BR11]. On
the other hand, strong direct product conjectures have been shown to be false by Shaltiel [Sha04]
in some models of distributional communication complexity (and of query complexity and circuit
depth complexity) under specific choices for the error parameter.

Examples of direct product theorems in other models of computation include Yao's XOR
lemma [Yao82]; Raz's [Raz95] theorem for two-prover games; Shaltiel's [Sha04] theorem for fair
decision trees; Nisan, Rudich and Saks' [NRS99] theorem for decision forests; Drucker's [Dru11]
theorem for randomized query complexity; Sherstov's [She11] theorem for approximate
polynomial degree and Lee and Roland's [LR11] theorem for quantum query complexity. Besides their
inherent importance, direct product theorems have had various important applications such
as in probabilistically checkable proofs [Raz95]; in circuit complexity [Yao82] and in showing
time-space tradeoffs [KŠdW04, AŠdW09, Kla10].

In this paper, we show a direct product theorem for the two-party bounded-round public-coin
randomized communication complexity. In this model, for computing a relation
$f \subseteq \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$ ($\mathcal{X}, \mathcal{Y}, \mathcal{Z}$ are finite sets),
one party, say Alice, is given an input $x \in \mathcal{X}$ and the other party, say
Bob, is given an input $y \in \mathcal{Y}$. They are supposed to do local computations using public coins
shared between them, communicate a fixed number of messages between them and at the end,
output an element $z \in \mathcal{Z}$. They are said to succeed if $(x, y, z) \in f$. For a natural number
$t \geq 1$ and $\varepsilon \in (0, 1)$, let $\mathrm{R}^{(t),\mathrm{pub}}_{\varepsilon}(f)$ denote the two-party $t$-message
public-coin communication complexity of $f$ with worst case error $\varepsilon$, that is the communication
of the best public-coin protocol between Alice and Bob with $t$ messages exchanged between them,
and the error (over the public coins) on any input $(x, y)$ being at most $\varepsilon$. We show the following.

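Theorem 1.1. Let $\mathcal{X}, \mathcal{Y}, \mathcal{Z}$ be finite sets, $f \subseteq \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$
a relation, $\varepsilon > 0$ and $k, t \geq 1$ be integers. There exists a constant $\kappa$ such that,

\[
\mathrm{R}^{(t),\mathrm{pub}}_{1 - (1 - \varepsilon/2)^{\Omega(k \varepsilon^2 / t^2)}}\left(f^k\right)
= \Omega\left(\frac{\varepsilon \cdot k}{t} \cdot \left(\mathrm{R}^{(t),\mathrm{pub}}_{\varepsilon}(f) - \frac{\kappa t^2}{\varepsilon}\right)\right).
\]

In particular, it implies a strong direct product theorem for the two-party constant-message
public-coin randomized communication complexity of all relations $f$. Our result generalizes
the result of Jain [Jai11], which can be regarded as the special case when $t = 1$.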

As a direct consequence of our result we get a direct product theorem for the pointer chasing
problem defined as follows. Let $n, t \geq 1$ be integers. Alice and Bob are given functions
$F_A : [n] \to [n]$ and $F_B : [n] \to [n]$, respectively. Let $F^t$ represent alternate composition of $F_A$ and
$F_B$ done $t$ times, starting with $F_A$. The parties are supposed to communicate and determine
$F^t(1)$. In the bit version of the problem, the players are supposed to output the least significant
bit of $F^t(1)$. We refer to the $t$-pointer chasing problem as $\mathrm{FP}_t$ and the bit version as $\mathrm{BP}_t$. The
pointer chasing problem naturally captures the trade-off between number of messages exchanged
and the communication used. There is a straightforward $t$-message deterministic protocol with
$t \cdot \log n$ bits of communication for both $\mathrm{FP}_t$ and $\mathrm{BP}_t$. However if only $t - 1$ messages are
allowed to be exchanged between the parties, exponentially more communication is required.
The communication complexity of this problem has been very well studied both in the classical
and quantum models of communication complexity [NW91, Kla00, PRV01, KNTSZ01, JRS02].
The best lower bounds we know so far are as follows (below $\mathrm{Q}^{(t)}(\cdot)$ stands for the $t$-message
quantum communication complexity).

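Theorem 1.2. For integer $t \geq 1$,

1. [PRV01] $\mathrm{R}^{(t-1),\mathrm{pub}}_{1/3}(\mathrm{FP}_t) \geq \Omega\left(n \log^{(t-1)} n\right)$.

2. [PRV01] $\mathrm{R}^{(t-1),\mathrm{pub}}_{1/3}(\mathrm{BP}_t) \geq \Omega(n)$.

3. [JRS02] $\mathrm{Q}^{(t-1)}_{1/3}(\mathrm{FP}_t) \geq \Omega\left(n \log^{(t-1)} n\right)$.

As a consequence of Theorem 1.1 we get strong direct product results for this problem. Note
that in the descriptions of $\mathrm{FP}_t$ and $\mathrm{BP}_t$, $t$ is a fixed constant, not dependent on the input size.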

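Corollary 1.3. For integers $t, k \geq 1$,

1. $\mathrm{R}^{(t-1),\mathrm{pub}}_{1 - 2^{-\Omega(k/t^2)}}\left(\mathrm{FP}_t^k\right) \geq \Omega\left(\frac{k}{t} \cdot n \log^{(t-1)} n\right)$.

2. $\mathrm{R}^{(t-1),\mathrm{pub}}_{1 - 2^{-\Omega(k/t^2)}}\left(\mathrm{BP}_t^k\right) \geq \Omega\left(\frac{k}{t} \cdot n\right)$.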
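
The pointer chasing function and the straightforward $t$-message protocol described above can be
sketched in Python as follows. This is a minimal illustration under stated assumptions: the
functions $F_A$, $F_B$ are represented as dictionaries on $\{1, \ldots, n\}$, and the names
`pointer_chase` and `naive_cost_bits` are hypothetical helpers, not notation from the paper.

```python
from typing import Dict, List, Tuple


def pointer_chase(f_a: Dict[int, int], f_b: Dict[int, int], t: int,
                  start: int = 1) -> Tuple[int, List[int]]:
    """Follow the pointer t times, alternating F_A, F_B, F_A, ...

    Returns the final pointer F^t(start) and the list of messages sent,
    where message s is the pointer value after s steps.
    """
    pointer = start
    messages = []
    for step in range(t):
        table = f_a if step % 2 == 0 else f_b   # Alice applies F_A first
        pointer = table[pointer]
        messages.append(pointer)                # sender announces the new pointer
    return pointer, messages


def naive_cost_bits(n: int, t: int) -> int:
    """Bits used by the naive protocol: t messages, each encoding one element of [n]."""
    bits_per_pointer = max(1, (n - 1).bit_length())   # ceil(log2 n), at least one bit
    return t * bits_per_pointer


if __name__ == "__main__":
    f_a = {1: 3, 2: 1, 3: 4, 4: 2}
    f_b = {1: 2, 2: 4, 3: 1, 4: 3}
    value, transcript = pointer_chase(f_a, f_b, t=3)
    print("FP_3 answer:", value, "BP_3 answer:", value & 1, "messages:", transcript)
    print("communication in bits:", naive_cost_bits(n=4, t=3))
```

Each message depends on the previous one, which is why removing even one round forces the parties
to send far more information up front.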

When $\mathrm{R}^{(t),\mathrm{pub}}_{\varepsilon}(f)$ is a constant, then a direct product result can be shown via direct
arguments as for example in [Jai11, She11].
Our techniques

We prove our direct product result using information theoretic arguments. Information theory
is a versatile tool in communication complexity, especially in proving lower bounds and direct
sum and direct product theorems [Cha01, BYJKS02, JRS03a, JRS03b, JRS05, JK09, BBCR10,
BR11, Jai11]. The broad argument that we use is as follows. For a given relation $f$, let the
communication required for computing one instance with $t$ messages and constant success be
$c$. Let us consider a protocol for computing $f^k$ with $t$ messages and communication cost $o(kc)$.
Let us condition on success on some $l$ coordinates. If the overall success in these $l$ coordinates is
already as small as we want then we are done and stop. Otherwise we exhibit another coordinate
$j$ outside of these $l$ coordinates such that the success in the $j$-th coordinate, even conditioned
on the success in the $l$ coordinates, is bounded away from 1. This way the overall success
keeps going down and becomes exponentially small eventually. We do this argument in the
distributional setting where one is concerned with average error over the inputs coming from a
specified distribution rather than the worst case error over all inputs. The distributional setting
can then be related to the worst case setting by the well known Yao's principle [Yao79].

More concretely, let $\mu$ be a distribution on $\mathcal{X} \times \mathcal{Y}$, possibly non-product across $\mathcal{X}$ and $\mathcal{Y}$. Let
$c$ be the minimum communication required for computing $f$ with $t$-message protocols having
error at most $\varepsilon$ averaged over $\mu$. Let us consider the inputs for $f^k$ drawn from the distribution
$\mu^k$ ($k$ independent copies of $\mu$). Consider a $t$-message protocol $\mathcal{P}$ for $f^k$ with communication
$o(kc)$ and for the rest of the argument condition on success on a set $C$ of coordinates. If the
success probability of this event is as small as we desire then we are done. Otherwise we exhibit
a new coordinate $j \notin C$ satisfying the following conditions: first, the distribution of inputs
$X_j Y_j$ (of Alice and Bob respectively) in the $j$-th coordinate is quite close to $\mu$; second, the joint
distribution $X_j Y_j M$ (where $M$ is the message transcript of $\mathcal{P}$) can be approximated very well by
Alice and Bob using a $t$-message protocol for $f$, when they are given input according to $\mu$, using
communication less than $c$. This shows that success in the $j$-th coordinate must be bounded
away from one. Since we can simulate each message only approximately, in order to keep the
overall error bounded, we are able to make our argument for protocols with a bounded number
of message exchanges.

One difficulty that is faced in this argument is that since $\mu$ may be a non-product
distribution, Alice and Bob may obtain information about each other's input in the $j$-th coordinate via
their inputs in other coordinates. This is overcome by splitting the distribution $\mu$ into a convex
combination of several product distributions. This idea of splitting a non-product distribution
into a convex combination of product distributions has been used in several previous works to
handle non-product distributions in different settings [Raz92, Raz95, BYJKS02, Hol07, BBCR10,
BR11, Jai11]. Some important tools that we use in our arguments are a message compression
protocol due to Braverman and Rao [BR11] and the correlated sampling protocol that appeared
for example in Holenstein [Hol07].

Organization 

The rest of the paper is organized as follows. In Section 2 we present some background on
information theory and communication complexity. In Section 3 we prove our main result
Theorem 1.1, starting with some lemmas that are helpful in building the proof.

2 Preliminaries 

Information theory 

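For integer $n \geq 1$, let $[n]$ represent the set $\{1, 2, \ldots, n\}$. Let $\mathcal{X}$, $\mathcal{Y}$ be finite sets and $k$ be a
natural number. Let $\mathcal{X}^k$ be the set $\mathcal{X} \times \cdots \times \mathcal{X}$, the cross product of $\mathcal{X}$, $k$ times. Let $\mu$ be a
(probability) distribution on $\mathcal{X}$. Let $\mu(x)$ represent the probability of $x \in \mathcal{X}$ according to $\mu$.
Let $X$ be a random variable distributed according to $\mu$, which we denote by $X \sim \mu$. We use the
same symbol to represent a random variable and its distribution whenever it is clear from the
context. The expectation value of some function $f$ on $\mathcal{X}$ is denoted as

\[
\mathbb{E}_{x \leftarrow X}[f(x)] \stackrel{\mathrm{def}}{=} \sum_{x \in \mathcal{X}} \Pr[X = x] \cdot f(x).
\]

The entropy of $X$ is defined to be $\mathrm{H}(X) \stackrel{\mathrm{def}}{=} -\sum_{x} \mu(x) \cdot \log \mu(x)$. For two distributions $\mu$, $\lambda$ on
$\mathcal{X}$, the distribution $\mu \otimes \lambda$ is defined as $(\mu \otimes \lambda)(x_1, x_2) \stackrel{\mathrm{def}}{=} \mu(x_1) \cdot \lambda(x_2)$. Let
$\mu^k \stackrel{\mathrm{def}}{=} \mu \otimes \cdots \otimes \mu$, $k$ times. The $\ell_1$ distance between $\mu$ and $\lambda$ is defined to be half of the
$\ell_1$ norm of $\mu - \lambda$; that is

\[
\|\lambda - \mu\|_1 \stackrel{\mathrm{def}}{=} \frac{1}{2} \sum_{x} |\lambda(x) - \mu(x)| = \max_{S \subseteq \mathcal{X}} |\lambda_S - \mu_S|,
\]

where $\lambda_S \stackrel{\mathrm{def}}{=} \sum_{x \in S} \lambda(x)$. We say that $\lambda$ is $\varepsilon$-close to $\mu$ if $\|\lambda - \mu\|_1 \leq \varepsilon$. The relative entropy
between distributions $X$ and $Y$ on $\mathcal{X}$ is defined as

\[
\mathrm{S}(X \| Y) \stackrel{\mathrm{def}}{=} \sum_{x} \Pr[X = x] \cdot \log \frac{\Pr[X = x]}{\Pr[Y = x]}.
\]

The relative min-entropy between them is defined as

\[
\mathrm{S}_\infty(X \| Y) \stackrel{\mathrm{def}}{=} \max_{x \in \mathcal{X}} \left\{ \log \frac{\Pr[X = x]}{\Pr[Y = x]} \right\}.
\]

It is easy to see that $\mathrm{S}(X \| Y) \leq \mathrm{S}_\infty(X \| Y)$. Let $X, Y, Z$ be jointly distributed random variables.
Let $Y_x$ be the distribution of $Y$ conditioned on $X = x$. The conditional entropy of $Y$ conditioned
on $X$ is defined as $\mathrm{H}(Y|X) \stackrel{\mathrm{def}}{=} \mathbb{E}_{x \leftarrow X}[\mathrm{H}(Y_x)] = \mathrm{H}(XY) - \mathrm{H}(X)$. The mutual information between
$X$ and $Y$ is defined as

\[
\mathrm{I}(X; Y) \stackrel{\mathrm{def}}{=} \mathrm{H}(X) + \mathrm{H}(Y) - \mathrm{H}(XY)
= \mathbb{E}_{y \leftarrow Y}\left[\mathrm{S}(X_y \| X)\right] = \mathbb{E}_{x \leftarrow X}\left[\mathrm{S}(Y_x \| Y)\right].
\]

It is easily seen that $\mathrm{I}(X; Y) = \mathrm{S}(XY \| X \otimes Y)$. We say that $X$ and $Y$ are independent iff
$\mathrm{I}(X; Y) = 0$. The conditional mutual information between $X$ and $Y$, conditioned on $Z$, is
defined as

\[
\mathrm{I}(X; Y | Z) \stackrel{\mathrm{def}}{=} \mathbb{E}_{z \leftarrow Z}\left[\mathrm{I}(X; Y | Z = z)\right] = \mathrm{H}(X|Z) + \mathrm{H}(Y|Z) - \mathrm{H}(XY|Z).
\]

The following chain rule for mutual information is easily seen,

\[
\mathrm{I}(X; YZ) = \mathrm{I}(X; Z) + \mathrm{I}(X; Y | Z).
\]

Let $X, X', Y, Z$ be jointly distributed random variables. We define the joint distribution of
$(X'Z)(Y|X)$ by

\[
\Pr[(X'Z)(Y|X) = x, z, y] \stackrel{\mathrm{def}}{=} \Pr[X' = x, Z = z] \cdot \Pr[Y = y | X = x].
\]

We say that $X, Y, Z$ is a Markov chain iff $XYZ = (XY)(Z|Y)$ and we denote it by $X \leftrightarrow Y \leftrightarrow Z$.
It is easy to see that $X, Y, Z$ is a Markov chain if and only if $\mathrm{I}(X; Z | Y) = 0$. Ibinson, Linden
and Winter [ILW08] showed that if $\mathrm{I}(X; Z | Y)$ is small then $XYZ$ is close to being a Markov
chain.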
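
The quantities defined above are easy to evaluate numerically on small examples. The following
Python sketch computes entropy, relative entropy and mutual information for distributions stored
as dictionaries, and can be used to check the identity $\mathrm{I}(X;Y) = \mathrm{S}(XY \| X \otimes Y)$
on a toy joint distribution. The helper names and the example distribution are illustrative
assumptions; logarithms are taken to base 2.

```python
import math
from collections import defaultdict


def entropy(dist):
    """H(X) = -sum_x mu(x) log2 mu(x), ignoring zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def relative_entropy(p, q):
    """S(p || q) = sum_x p(x) log2(p(x)/q(x)); assumes q(x) > 0 wherever p(x) > 0."""
    return sum(px * math.log2(px / q[x]) for x, px in p.items() if px > 0)


def marginal(joint, index):
    """Marginal of a joint distribution over tuples on the given coordinate."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[outcome[index]] += p
    return dict(out)


def product(px, py):
    """The product distribution px (x) py on pairs."""
    return {(x, y): a * b for x, a in px.items() for y, b in py.items()}


def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(XY) for a joint distribution over pairs."""
    return entropy(marginal(joint, 0)) + entropy(marginal(joint, 1)) - entropy(joint)


if __name__ == "__main__":
    joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
    independent = product(marginal(joint, 0), marginal(joint, 1))
    print("I(X;Y)          =", mutual_information(joint))
    print("S(XY || X (x) Y) =", relative_entropy(joint, independent))
```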
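Lemma 2.1 ([ILW08]). For any random variables $X$, $Y$ and $Z$, it holds that

\[
\mathrm{I}(X; Z | Y) = \min\left\{ \mathrm{S}(XYZ \| X'Y'Z') : X' \leftrightarrow Y' \leftrightarrow Z' \right\}.
\]

The minimum is achieved by distribution $X'Y'Z' = (XY)(Z|Y)$.

We will need the following basic facts. A very good text for reference on information theory
is [CT91].

Fact 2.2. Relative entropy is jointly convex in its arguments. That is, for distributions
$\mu, \mu^1, \lambda, \lambda^1$ and $p \in [0, 1]$,

\[
\mathrm{S}\left(p\mu + (1-p)\mu^1 \,\|\, p\lambda + (1-p)\lambda^1\right) \leq p \cdot \mathrm{S}(\mu \| \lambda) + (1-p) \cdot \mathrm{S}(\mu^1 \| \lambda^1).
\]

Fact 2.3. Relative entropy satisfies the following chain rule. Let $XY$ and $X^1Y^1$ be random
variables on $\mathcal{X} \times \mathcal{Y}$. It holds that

\[
\mathrm{S}(X^1Y^1 \| XY) = \mathrm{S}(X^1 \| X) + \mathbb{E}_{x \leftarrow X^1}\left[\mathrm{S}(Y^1_x \| Y_x)\right].
\]

In particular, using Fact 2.2,

\[
\mathrm{S}(X^1Y^1 \| X \otimes Y) = \mathrm{S}(X^1 \| X) + \mathbb{E}_{x \leftarrow X^1}\left[\mathrm{S}(Y^1_x \| Y)\right] \geq \mathrm{S}(X^1 \| X) + \mathrm{S}(Y^1 \| Y).
\]

Fact 2.4. Let $XY$ and $X^1Y^1$ be random variables on $\mathcal{X} \times \mathcal{Y}$. It holds that

\[
\mathrm{S}(X^1Y^1 \| X \otimes Y) \geq \mathrm{S}(X^1Y^1 \| X^1 \otimes Y^1) = \mathrm{I}(X^1; Y^1).
\]

Fact 2.5. For distributions $\lambda$ and $\mu$,

\[
0 \leq \|\lambda - \mu\|_1 \leq \sqrt{\mathrm{S}(\lambda \| \mu)}.
\]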
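
As a sanity check of Fact 2.5, the following Python sketch compares the $\ell_1$ distance (half
of the $\ell_1$ norm, as defined above) with the square root of the relative entropy on randomly
generated distributions. The helper names, the support size and the number of trials are
illustrative assumptions; a numerical check of this kind is of course not a proof.

```python
import math
import random


def l1_distance(lam, mu):
    """Half of the l1 norm of lam - mu (the convention used in the text)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(lam, mu))


def relative_entropy(lam, mu):
    """S(lam || mu) in bits; assumes mu[i] > 0 wherever lam[i] > 0."""
    return sum(a * math.log2(a / b) for a, b in zip(lam, mu) if a > 0)


def random_distribution(rng, size):
    """A random distribution with strictly positive entries."""
    weights = [rng.random() + 1e-9 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def check_fact_2_5(trials=1000, size=5, seed=0):
    """Return True if the inequality holds on all sampled pairs of distributions."""
    rng = random.Random(seed)
    for _ in range(trials):
        lam = random_distribution(rng, size)
        mu = random_distribution(rng, size)
        bound = math.sqrt(max(relative_entropy(lam, mu), 0.0))
        if l1_distance(lam, mu) > bound + 1e-12:
            return False
    return True


if __name__ == "__main__":
    print("Fact 2.5 holds on all samples:", check_fact_2_5())
```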

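Fact 2.6. Let $\lambda$ and $\mu$ be distributions on $\mathcal{X}$. For any subset $S \subseteq \mathcal{X}$, it holds that

\[
\sum_{x \in S} \lambda(x) \cdot \log \frac{\lambda(x)}{\mu(x)} \geq -1.
\]

Fact 2.7. The $\ell_1$ distance and relative entropy are monotone non-increasing when subsystems
are considered. Let $XY$ and $X^1Y^1$ be random variables, then

\[
\|XY - X^1Y^1\|_1 \geq \|X - X^1\|_1 \quad \text{and} \quad \mathrm{S}(XY \| X^1Y^1) \geq \mathrm{S}(X \| X^1).
\]

Fact 2.8. For function $f : \mathcal{X} \times \mathcal{R} \to \mathcal{Y}$ and random variables $X, Y$ on $\mathcal{X}$ and $R$ on $\mathcal{R}$, such
that $R$ is independent of $(XY)$, it holds that

\[
\|X f(X, R) - Y f(Y, R)\|_1 = \|X - Y\|_1.
\]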

The following definition was introduced by Holenstein [Hol07]. It plays a critical role in his
proof of a parallel repetition theorem for two-prover games.

Definition 2.9 ([Hol07]). For two distributions $(X_0 Y_0)$ and $(X_1 S Y_1 T)$, we say that $(X_0, Y_0)$
is $(1 - \varepsilon)$-embeddable in $(X_1 S, Y_1 T)$ if there exists a probability distribution $R$ over a set $\mathcal{R}$,
which is independent of $X_0 Y_0$, and functions $f_A : \mathcal{X} \times \mathcal{R} \to \mathcal{S}$, $f_B : \mathcal{Y} \times \mathcal{R} \to \mathcal{T}$, such that

\[
\|X_0 Y_0 f_A(X_0, R) f_B(Y_0, R) - X_1 Y_1 S T\|_1 \leq \varepsilon.
\]

The following lemma was shown by Holenstein [Hol07] using a correlated sampling protocol.

Lemma 2.10 ([Hol07]). For random variables $S$, $X$ and $Y$, if

\[
\|SXY - (XY)(S|X)\|_1 \leq \varepsilon
\]

and

\[
\|SXY - (XY)(S|Y)\|_1 \leq \varepsilon,
\]

then $(X, Y)$ is $(1 - 4\varepsilon)$-embeddable in $(XS, YS)$.
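
To give some intuition for correlated sampling, the following Python sketch implements a common
rejection-style scheme: the public coins form a sequence of pairs (candidate, height), and each
party outputs the first candidate whose height lies below its own probability for that candidate.
This is a sketch under stated assumptions; the exact protocol used in [Hol07] may differ in its
details, and the function name `correlated_sample` and the use of a shared seed to model public
coins are illustrative choices.

```python
import random
from typing import Dict, Hashable, Sequence


def correlated_sample(universe: Sequence[Hashable],
                      dist: Dict[Hashable, float],
                      shared_seed: int,
                      max_rounds: int = 100_000) -> Hashable:
    """One party's output in a rejection-style correlated sampling scheme.

    Both parties call this with the same `universe` and `shared_seed`
    (modelling public coins) but each with its own distribution `dist`.
    """
    rng = random.Random(shared_seed)
    for _ in range(max_rounds):
        candidate = rng.choice(universe)   # public coin: uniform element
        height = rng.random()              # public coin: uniform in [0, 1)
        if height < dist.get(candidate, 0.0):
            return candidate               # first pair under this party's curve
    raise RuntimeError("no candidate accepted; increase max_rounds")


if __name__ == "__main__":
    universe = ["a", "b", "c"]
    alice = {"a": 0.5, "b": 0.3, "c": 0.2}
    bob = {"a": 0.45, "b": 0.35, "c": 0.2}
    agree = sum(
        correlated_sample(universe, alice, seed) == correlated_sample(universe, bob, seed)
        for seed in range(2000)
    )
    print("fraction of seeds on which the outputs agree:", agree / 2000)
```

No communication is used: the parties only read the same public coins, and when their
distributions are close they usually accept the same pair and so produce the same output.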

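We will need the following generalization of the previous lemma.

Lemma 2.11. For joint random variables $(A', B', C')$ and $(A, B)$, satisfying

\[
\mathrm{S}(A'B' \| AB) \leq \varepsilon,
\]
\[
\mathbb{E}_{(a,c) \leftarrow A',C'}\left[\mathrm{S}(B'_{a,c} \| B_a)\right] \leq \varepsilon
\quad \text{and} \quad
\mathbb{E}_{(b,c) \leftarrow B',C'}\left[\mathrm{S}(A'_{b,c} \| A_b)\right] \leq \varepsilon,
\]

it holds that $(A, B)$ is $(1 - 5\sqrt{\varepsilon})$-embeddable in $(A'C', B'C')$.

Proof. Using the definition of the relative entropy, we have the following.

\[
\mathbb{E}_{(a,c) \leftarrow A',C'}\left[\mathrm{S}(B'_{a,c} \| B_a)\right] - \mathbb{E}_{(a,c) \leftarrow A',C'}\left[\mathrm{S}(B'_{a,c} \| B'_a)\right]
= \mathbb{E}_{(a,b) \leftarrow A',B'}\left[\log \frac{\Pr[B' = b | A' = a]}{\Pr[B = b | A = a]}\right]
= \mathbb{E}_{a \leftarrow A'}\left[\mathrm{S}(B'_a \| B_a)\right] \geq 0.
\]

This means that

\begin{align}
\mathbb{E}_{(a,c) \leftarrow A',C'}\left[\mathrm{S}(B'_{a,c} \| B'_a)\right] \leq \mathbb{E}_{(a,c) \leftarrow A',C'}\left[\mathrm{S}(B'_{a,c} \| B_a)\right] \leq \varepsilon. \tag{1}
\end{align}

Then

\begin{align}
\mathbb{E}_{(a,c) \leftarrow A',C'}\left[\mathrm{S}(B'_{a,c} \| B'_a)\right] &= \mathrm{S}\left(A'B'C' \,\|\, (A'C')(B'|A')\right) \tag{2} \\
&= \mathrm{S}\left(A'B'C' \,\|\, (A'B')(C'|A')\right) \tag{3} \\
&\geq \left\|A'B'C' - (A'B')(C'|A')\right\|_1^2. \tag{4}
\end{align}

Above, Eq. (2) follows from the definition of the relative entropy, Eq. (3) follows because
$(A'C')(B'|A')$ and $(A'B')(C'|A')$ are identically distributed, and Eq. (4) follows from Fact 2.5.
Now from Equations (1) and (4) we get

\[
\left\|A'B'C' - (A'B')(C'|A')\right\|_1 \leq \sqrt{\varepsilon}.
\]

By similar arguments we get

\[
\left\|A'B'C' - (A'B')(C'|B')\right\|_1 \leq \sqrt{\varepsilon}.
\]

The inequalities above and Lemma 2.10 imply that $(A', B')$ is $(1 - 4\sqrt{\varepsilon})$-embeddable in $(A'C', B'C')$.
Furthermore from Fact 2.5 and $\mathrm{S}(A'B' \| AB) \leq \varepsilon$ we get

\[
\|A'B' - AB\|_1 \leq \sqrt{\varepsilon}.
\]

Finally using the inequality above and Fact 2.8 we get that $(A, B)$ is $(1 - 5\sqrt{\varepsilon})$-embeddable in
$(A'C', B'C')$. □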
Communication complexity

Let $f \subseteq \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$ be a relation, $t \geq 1$ be an integer and $\varepsilon \in (0, 1)$. In this work we only
consider complete relations, that is for every $(x, y) \in \mathcal{X} \times \mathcal{Y}$, there is some $z \in \mathcal{Z}$ such that
$(x, y, z) \in f$. In the two-party $t$-message public-coin model of communication, Alice with input
$x \in \mathcal{X}$ and Bob with input $y \in \mathcal{Y}$, do local computation using public coins shared between them
and exchange $t$ messages, with Alice sending the first message. At the end of their protocol
the party receiving the $t$-th message outputs some $z \in \mathcal{Z}$. The output is declared correct if
$(x, y, z) \in f$ and wrong otherwise. Let $\mathrm{R}^{(t),\mathrm{pub}}_{\varepsilon}(f)$ represent the two-party $t$-message public-coin
communication complexity of $f$ with worst case error $\varepsilon$, i.e., the communication of the best
two-party $t$-message public-coin protocol for $f$ with error for each input $(x, y)$ being at most $\varepsilon$.
We similarly consider two-party $t$-message deterministic protocols where there are no public
coins used by Alice and Bob. Let $\mu$ be a distribution on $\mathcal{X} \times \mathcal{Y}$. We let $\mathrm{D}^{(t),\mu}_{\varepsilon}(f)$ represent the
two-party $t$-message distributional communication complexity of $f$ under $\mu$ with expected error
$\varepsilon$, i.e., the communication of the best two-party $t$-message deterministic protocol for $f$, with
distributional error (average error over the inputs) at most $\varepsilon$ under $\mu$. Following is a consequence
of the min-max theorem in game theory, see e.g., [KN96, Theorem 3.20, page 36].

Lemma 2.12 (Yao's principle, [Yao79]). $\mathrm{R}^{(t),\mathrm{pub}}_{\varepsilon}(f) = \max_{\mu} \mathrm{D}^{(t),\mu}_{\varepsilon}(f)$.

The following fact about communication protocols can be verified easily.

Fact 2.13. Let there be $t$ messages $M_1, \ldots, M_t$ in a deterministic communication protocol
between Alice and Bob with inputs $X, Y$ respectively where $X$ and $Y$ are independent. Then
for any $s \in [t]$, $X$ and $Y$ are independent even conditioned on $M_1, \ldots, M_s$.
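
Fact 2.13 can be checked numerically on a toy example. The following Python sketch builds the
joint distribution of the inputs and messages of a small two-message deterministic protocol with
independent inputs, and evaluates $\mathrm{I}(X; Y | M_1 \cdots M_s)$ via entropies. The protocol
(the two message functions), the input distributions and all helper names are illustrative
assumptions chosen only for this check.

```python
import math
from collections import defaultdict
from itertools import product


def entropy(dist):
    """Shannon entropy (base 2) of a distribution given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def marginal(joint, coords):
    """Marginal of a joint distribution over tuples on the given coordinates."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in coords)] += p
    return out


def conditional_mutual_information(joint, a, b, cond):
    """I(A;B|C) = H(AC) + H(BC) - H(ABC) - H(C), with a, b, cond tuples of coordinates."""
    return (entropy(marginal(joint, a + cond)) + entropy(marginal(joint, b + cond))
            - entropy(marginal(joint, a + b + cond)) - entropy(marginal(joint, cond)))


def run_protocol(px, py, first_msg, second_msg):
    """Joint distribution of (x, y, m1, m2) for independent inputs x ~ px, y ~ py."""
    joint = defaultdict(float)
    for (x, p), (y, q) in product(px.items(), py.items()):
        m1 = first_msg(x)          # Alice's message depends on x only
        m2 = second_msg(y, m1)     # Bob's reply depends on y and m1 only
        joint[(x, y, m1, m2)] += p * q
    return joint


if __name__ == "__main__":
    px = {x: 0.25 for x in range(4)}
    py = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}
    joint = run_protocol(px, py, lambda x: x % 2, lambda y, m1: (y + m1) // 2)
    print("I(X;Y|M1)    =", conditional_mutual_information(joint, (0,), (1,), (2,)))
    print("I(X;Y|M1 M2) =", conditional_mutual_information(joint, (0,), (1,), (2, 3)))
```

Both printed values should be zero up to floating-point error, reflecting that each message
partitions only the sender's input space once the earlier messages are fixed.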



3 Proof of Theorem 1.1

We start by showing a few lemmas which are helpful in the proof of the main result. The
following theorem was shown by Jain [Jai11] and follows primarily from a message compression
argument due to Braverman and Rao [BR11].

Theorem 3.1 ([BR11, Jai11]). Let $\delta > 0$, $c \geq 0$. Let $X', Y', N$ be random variables for which
$Y' \leftrightarrow X' \leftrightarrow N$ is a Markov chain and the following holds,

\[
\Pr_{(x,y,m) \leftarrow X',Y',N}\left[ \log \frac{\Pr[N = m | X' = x]}{\Pr[N = m | Y' = y]} > c \right] \leq \delta.
\]

There exists a public-coin protocol between Alice and Bob, with inputs $X', Y'$ respectively, with
a single message from Alice to Bob of $c + O(\log(1/\delta))$ bits, such that at the end of the protocol,
Alice and Bob both possess a random variable $M$ satisfying $\|X'Y'N - X'Y'M\|_1 \leq 2\delta$.

We will need the following generalization of the above.

Lemma 3.2. Let $c \geq 0$, $1 > \varepsilon > 0$, $\varepsilon' > 0$. Let $X', Y', M'$ be random variables for which the
following holds,

\[
\mathrm{I}(X'; M' | Y') \leq c \quad \text{and} \quad \mathrm{I}(Y'; M' | X') \leq \varepsilon.
\]

There exists a public-coin protocol between Alice and Bob, with inputs $X', Y'$ respectively, with
a single message from Alice to Bob of $\frac{c+5}{\varepsilon'} + O\left(\log \frac{1}{\varepsilon'}\right)$ bits, such that at the end of the protocol,
Alice and Bob both possess a random variable $M$ satisfying $\|X'Y'M' - X'Y'M\|_1 \leq 3\sqrt{\varepsilon} + 6\varepsilon'$.

Proof. Let us introduce a new random variable $N$ with joint distribution $X'Y'N \stackrel{\mathrm{def}}{=} (X'Y')(M'|X')$.
Note that $Y' \leftrightarrow X' \leftrightarrow N$ is a Markov chain. Using Lemma 2.1 we have

\[
\mathrm{S}(X'Y'M' \| X'Y'N) = \mathrm{I}(Y'; M' | X') \leq \varepsilon.
\]

Applying Fact 2.5 we get that $\|X'Y'M' - X'Y'N\|_1 \leq \sqrt{\varepsilon}$. Using this, the following claim,
and Theorem 3.1 we conclude the desired. □



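Claim 3.3.

\[
\Pr_{(m,x,y) \leftarrow N,X',Y'}\left[ \log \frac{\Pr[N = m | X' = x]}{\Pr[N = m | Y' = y]} > \frac{c+5}{\varepsilon'} \right] \leq 3\varepsilon' + \sqrt{\varepsilon}.
\]

Proof. For any $m, x, y$ it holds that

\begin{align}
\log \frac{\Pr[N = m | X' = x]}{\Pr[N = m | Y' = y]}
&= \log \frac{\Pr[N = m | X' = x, Y' = y]}{\Pr[N = m | Y' = y]} \notag \\
&= \log \frac{\Pr[N = m | X' = x, Y' = y]}{\Pr[M' = m | X' = x, Y' = y]}
 + \log \frac{\Pr[M' = m | X' = x, Y' = y]}{\Pr[M' = m | Y' = y]}
 + \log \frac{\Pr[M' = m, Y' = y]}{\Pr[N = m, Y' = y]}. \tag{5}
\end{align}

We bound each term above separately. For the first one, let us define the set

\[
G_1 \stackrel{\mathrm{def}}{=} \left\{ (m, x, y) : \log \frac{\Pr[N = m | X' = x, Y' = y]}{\Pr[M' = m | X' = x, Y' = y]} \leq \frac{\varepsilon + 1}{\varepsilon'} \right\}.
\]

Consider,

\begin{align}
0 &\geq -\mathrm{S}(M'X'Y' \| NX'Y') \notag \\
&= \mathbb{E}_{(m,x,y) \leftarrow M',X',Y'}\left[ \log \frac{\Pr[N = m | X' = x, Y' = y]}{\Pr[M' = m | X' = x, Y' = y]} \right] \tag{6} \\
&= \sum_{(m,x,y) \in G_1} \Pr[M' = m, X' = x, Y' = y] \cdot \log \frac{\Pr[N = m | X' = x, Y' = y]}{\Pr[M' = m | X' = x, Y' = y]} \notag \\
&\quad + \sum_{(m,x,y) \notin G_1} \Pr[M' = m, X' = x, Y' = y] \cdot \log \frac{\Pr[N = m | X' = x, Y' = y]}{\Pr[M' = m | X' = x, Y' = y]} \notag \\
&\geq \sum_{(m,x,y) \in G_1} \Pr[M' = m, X' = x, Y' = y] \cdot \log \frac{\Pr[N = m | X' = x, Y' = y]}{\Pr[M' = m | X' = x, Y' = y]}
 + \Pr[(M', X', Y') \notin G_1] \cdot \frac{\varepsilon + 1}{\varepsilon'} \tag{7} \\
&= \sum_{(m,x,y) \notin G_1} \Pr[M' = m, X' = x, Y' = y] \cdot \log \frac{\Pr[M' = m | X' = x, Y' = y]}{\Pr[N = m | X' = x, Y' = y]}
 - \mathrm{S}(M'X'Y' \| NX'Y') + \Pr[(M', X', Y') \notin G_1] \cdot \frac{\varepsilon + 1}{\varepsilon'} \tag{8} \\
&\geq -1 - \varepsilon + \Pr[(M', X', Y') \notin G_1] \cdot \frac{\varepsilon + 1}{\varepsilon'}. \tag{9}
\end{align}

Above, Eq. (6) and Eq. (8) follow from the definition of the relative entropy, and Eq. (7)
follows from the definition of $G_1$. To get Eq. (9), we use Fact 2.6. Eq. (9) implies that
$\Pr[(M', X', Y') \notin G_1] \leq \varepsilon'$.

To upper bound the second term let us define

\[
G_2 \stackrel{\mathrm{def}}{=} \left\{ (m, x, y) : \log \frac{\Pr[M' = m | X' = x, Y' = y]}{\Pr[M' = m | Y' = y]} \leq \frac{c + 1}{\varepsilon'} \right\}.
\]

Consider,

\begin{align}
c &\geq \mathrm{I}(M'; X' | Y') \tag{10} \\
&= \mathbb{E}_{(m,x,y) \leftarrow M',X',Y'}\left[ \log \frac{\Pr[M' = m | X' = x, Y' = y]}{\Pr[M' = m | Y' = y]} \right] \tag{11} \\
&= \sum_{(m,x,y) \in G_2} \Pr[M' = m, X' = x, Y' = y] \cdot \log \frac{\Pr[M' = m | X' = x, Y' = y]}{\Pr[M' = m | Y' = y]} \notag \\
&\quad + \sum_{(m,x,y) \notin G_2} \Pr[M' = m, X' = x, Y' = y] \cdot \log \frac{\Pr[M' = m | X' = x, Y' = y]}{\Pr[M' = m | Y' = y]} \notag \\
&\geq \frac{c + 1}{\varepsilon'} \cdot \Pr[(M', X', Y') \notin G_2] - 1. \tag{12}
\end{align}

Above, Eq. (10) is one of the assumptions in the lemma; Eq. (11) follows from the definition
of the conditional mutual information; Eq. (12) follows from the definition of $G_2$ and Fact 2.6.
Eq. (12) implies that $\Pr[(M', X', Y') \notin G_2] \leq \varepsilon'$.

To bound the last term define

\[
G_3 \stackrel{\mathrm{def}}{=} \left\{ (m, x, y) : \log \frac{\Pr[M' = m, Y' = y]}{\Pr[N = m, Y' = y]} \leq \frac{\varepsilon + 1}{\varepsilon'} \right\}.
\]

Consider,

\begin{align}
\varepsilon &\geq \mathrm{S}(X'Y'M' \| X'Y'N) \notag \\
&\geq \mathrm{S}(Y'M' \| Y'N) \tag{13} \\
&= \mathbb{E}_{(m,y) \leftarrow M',Y'}\left[ \log \frac{\Pr[M' = m, Y' = y]}{\Pr[N = m, Y' = y]} \right] \notag \\
&= \sum_{(m,x,y) \in G_3} \Pr[M' = m, X' = x, Y' = y] \cdot \log \frac{\Pr[M' = m, Y' = y]}{\Pr[N = m, Y' = y]} \notag \\
&\quad + \sum_{(m,x,y) \notin G_3} \Pr[M' = m, X' = x, Y' = y] \cdot \log \frac{\Pr[M' = m, Y' = y]}{\Pr[N = m, Y' = y]} \notag \\
&\geq -1 + \Pr[(M', X', Y') \notin G_3] \cdot \frac{\varepsilon + 1}{\varepsilon'}. \tag{14}
\end{align}

Above, Eq. (13) follows from Fact 2.7 and Eq. (14) follows from the definition of $G_3$ (together with
Fact 2.6). This implies $\Pr[(M', X', Y') \notin G_3] \leq \varepsilon'$.

On combining the bounds for the three terms, using Eq. (5) and using the union bound we
get (recall $1 > \varepsilon > 0$)

\[
\Pr_{(m,x,y) \leftarrow M',X',Y'}\left[ \log \frac{\Pr[N = m | X' = x]}{\Pr[N = m | Y' = y]} > \frac{c+5}{\varepsilon'} \right] \leq 3\varepsilon'.
\]

Now using $\|X'Y'M' - X'Y'N\|_1 \leq \sqrt{\varepsilon}$ (as was shown previously), we finally have,

\[
\Pr_{(m,x,y) \leftarrow N,X',Y'}\left[ \log \frac{\Pr[N = m | X' = x]}{\Pr[N = m | Y' = y]} > \frac{c+5}{\varepsilon'} \right] \leq 3\varepsilon' + \sqrt{\varepsilon}. \qquad □
\]

We will need the following further generalization of the previous lemma.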



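Lemma 3.4. Let $t \geq 1$ be an integer. Let $\varepsilon' > 0$, $c_s \geq 0$, $1 > \varepsilon_s > 0$ for each $1 \leq s \leq t$.
Let $R', X', Y', M'_1, \ldots, M'_t$ be random variables for which the following holds (below
$M'_{<s} \stackrel{\mathrm{def}}{=} M'_1 \cdots M'_{s-1}$),

\[
\mathrm{I}(X'; M'_s | Y'R'M'_{<s}) \leq c_s, \quad \mathrm{I}(Y'; M'_s | X'R'M'_{<s}) \leq \varepsilon_s, \quad \text{for odd } s
\]

and

\[
\mathrm{I}(Y'; M'_s | X'R'M'_{<s}) \leq c_s, \quad \mathrm{I}(X'; M'_s | Y'R'M'_{<s}) \leq \varepsilon_s, \quad \text{for even } s.
\]

There exists a public-coin $t$-message protocol $\mathcal{P}_t$ between Alice, with input $X'R'$, and Bob, with
input $Y'R'$, with Alice sending the first message. The total communication is

\[
\frac{\sum_{s=1}^{t} c_s + 5t}{\varepsilon'} + O\left(t \log \frac{1}{\varepsilon'}\right)
\]

and at the end of the protocol, both Alice and Bob possess random variables $M_1, \ldots, M_t$, satisfying

\[
\left\|R'X'Y'M_1 \cdots M_t - R'X'Y'M'_1 \cdots M'_t\right\|_1 \leq 3 \sum_{s=1}^{t} \sqrt{\varepsilon_s} + 6\varepsilon' t.
\]

Proof. We prove the lemma by induction on $t$. For the base case $t = 1$, note that

\[
\mathrm{I}(X'R'; M'_1 | Y'R') = \mathrm{I}(X'; M'_1 | Y'R') \leq c_1
\]

and

\[
\mathrm{I}(Y'R'; M'_1 | X'R') = \mathrm{I}(Y'; M'_1 | X'R') \leq \varepsilon_1.
\]

Lemma 3.2 implies (by taking $X', Y', M'$ in Lemma 3.2 to be $X'R', Y'R', M'_1$ respectively) that
Alice, with input $X'R'$, and Bob, with input $Y'R'$, can run a public-coin protocol with a single
message from Alice to Bob of

\[
\frac{c_1 + 5}{\varepsilon'} + O\left(\log \frac{1}{\varepsilon'}\right)
\]

bits and generate a new random variable $M_1$ satisfying

\[
\left\|R'X'Y'M'_1 - R'X'Y'M_1\right\|_1 \leq 3\sqrt{\varepsilon_1} + 6\varepsilon'.
\]

Now let $t > 1$. Assume $t$ is odd, for even $t$ a similar argument will follow. From the induction
hypothesis there exists a public-coin $(t-1)$-message protocol $\mathcal{P}_{t-1}$ between Alice, with input
$X'R'$, and Bob, with input $Y'R'$, with Alice sending the first message, and total communication

\begin{align}
\frac{\sum_{s=1}^{t-1} c_s + 5(t-1)}{\varepsilon'} + O\left((t-1) \log \frac{1}{\varepsilon'}\right), \tag{15}
\end{align}

such that at the end Alice and Bob both possess random variables $M_1, \ldots, M_{t-1}$ satisfying

\begin{align}
\left\|R'X'Y'M_1 \cdots M_{t-1} - R'X'Y'M'_1 \cdots M'_{t-1}\right\|_1 \leq 3 \sum_{s=1}^{t-1} \sqrt{\varepsilon_s} + 6\varepsilon'(t-1). \tag{16}
\end{align}

Note that

\[
\mathrm{I}(X'R'M'_{<t}; M'_t | Y'R'M'_{<t}) = \mathrm{I}(X'; M'_t | Y'R'M'_{<t}) \leq c_t
\]

and

\[
\mathrm{I}(Y'R'M'_{<t}; M'_t | X'R'M'_{<t}) = \mathrm{I}(Y'; M'_t | X'R'M'_{<t}) \leq \varepsilon_t.
\]

Therefore Lemma 3.2 implies (by taking $X', Y', M'$ in Lemma 3.2 to be $X'R'M'_{<t}$, $Y'R'M'_{<t}$, $M'_t$
respectively) that Alice, with input $X'R'M'_{<t}$, and Bob, with input $Y'R'M'_{<t}$, can run a
public-coin protocol $\mathcal{P}$ with a single message from Alice to Bob of

\begin{align}
\frac{c_t + 5}{\varepsilon'} + O\left(\log \frac{1}{\varepsilon'}\right) \tag{17}
\end{align}

bits and generate a new random variable $M''_t$ satisfying

\begin{align}
\left\|R'X'Y'M'_1 \cdots M'_{t-1}M'_t - R'X'Y'M'_1 \cdots M'_{t-1}M''_t\right\|_1 \leq 3\sqrt{\varepsilon_t} + 6\varepsilon'. \tag{18}
\end{align}

Fact 2.8 and Eq. (16) imply that Alice, on input $X'R'M_{<t}$, and Bob, on input $Y'R'M_{<t}$, on
running the same protocol $\mathcal{P}$ will generate a new random variable $M_t$ satisfying

\begin{align}
\left\|R'X'Y'M_1 \cdots M_{t-1}M_t - R'X'Y'M'_1 \cdots M'_{t-1}M''_t\right\|_1
&= \left\|R'X'Y'M_1 \cdots M_{t-1} - R'X'Y'M'_1 \cdots M'_{t-1}\right\|_1 \notag \\
&\leq 3 \sum_{s=1}^{t-1} \sqrt{\varepsilon_s} + 6\varepsilon'(t-1). \tag{19}
\end{align}

Therefore by composing protocol $\mathcal{P}_{t-1}$ and protocol $\mathcal{P}$ and using Equations (15), (17), (18),
(19) together with the triangle inequality, we get a public-coin $t$-message protocol $\mathcal{P}_t$ between
Alice, with input $X'R'$, and Bob, with input $Y'R'$, with Alice sending the first message, and total
communication

\[
\frac{\sum_{s=1}^{t} c_s + 5t}{\varepsilon'} + O\left(t \log \frac{1}{\varepsilon'}\right),
\]

such that at the end Alice and Bob both possess random variables $M_1, \ldots, M_t$ satisfying

\[
\left\|R'X'Y'M_1 \cdots M_t - R'X'Y'M'_1 \cdots M'_t\right\|_1 \leq 3 \sum_{s=1}^{t} \sqrt{\varepsilon_s} + 6\varepsilon' t. \qquad □
\]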

The following lemma, obtained from the lemma above, is the one that we will finally use in the
proof of our main result.

Lemma 3.5. Let random variables $R', X', Y', M'_1, \ldots, M'_t$ and numbers $\varepsilon', c_s, \varepsilon_s$ satisfy all the
conditions in Lemma 3.4. Let $\tau > 0$ and let random variables $(X, Y)$ be $(1 - \tau)$-embeddable in
$(X'R', Y'R')$. There exists a public-coin $t$-message protocol $\mathcal{Q}_t$ between Alice, with input $X$, and
Bob, with input $Y$, with Alice sending the first message, and total communication

\[
\frac{\sum_{s=1}^{t} c_s + 5t}{\varepsilon'} + O\left(t \log \frac{1}{\varepsilon'}\right)
\]

bits, such that at the end Alice possesses $R_A M_1 \cdots M_t$ and Bob possesses $R_B M_1 \cdots M_t$, such
that

\[
\left\|XYR_AR_BM_1 \cdots M_t - X'Y'R'R'M'_1 \cdots M'_t\right\|_1 \leq \tau + 3 \sum_{s=1}^{t} \sqrt{\varepsilon_s} + 6\varepsilon' t.
\]

Proof. In $\mathcal{Q}_t$, Alice and Bob, using public coins and no communication, first generate $R_A, R_B$
such that $\|XYR_AR_B - X'Y'R'R'\|_1 \leq \tau$. They can do this by the definition of embedding
(Definition 2.9). Now they run protocol $\mathcal{P}_t$ (as in Lemma 3.4) with Alice's input being $XR_A$ and
Bob's input being $YR_B$, and at the end both possess $M_1, \ldots, M_t$. From Lemma 3.4 the communication
of $\mathcal{Q}_t$ is as desired. Now from Fact 2.8 and Lemma 3.4,

\[
\left\|XYR_AR_BM_1 \cdots M_t - X'Y'R'R'M'_1 \cdots M'_t\right\|_1 \leq \tau + 3 \sum_{s=1}^{t} \sqrt{\varepsilon_s} + 6\varepsilon' t. \qquad □
\]

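We are now ready to prove our main result, Theorem 1.1. We restate it here for convenience.

Theorem 1.1. Let $\mathcal{X}, \mathcal{Y}, \mathcal{Z}$ be finite sets, $f \subseteq \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$
a relation, $\varepsilon > 0$ and $k, t \geq 1$ be integers. There exists a constant $\kappa$ such that,

\[
\mathrm{R}^{(t),\mathrm{pub}}_{1 - (1 - \varepsilon/2)^{\Omega(k \varepsilon^2 / t^2)}}\left(f^k\right)
= \Omega\left(\frac{\varepsilon \cdot k}{t} \cdot \left(\mathrm{R}^{(t),\mathrm{pub}}_{\varepsilon}(f) - \frac{\kappa t^2}{\varepsilon}\right)\right).
\]

Proof of Theorem 1.1. Let $c \stackrel{\mathrm{def}}{=} \mathrm{R}^{(t),\mathrm{pub}}_{\varepsilon}(f) - \frac{\kappa t^2}{\varepsilon}$ for $\kappa$ to be chosen later. Let
$\delta \stackrel{\mathrm{def}}{=} \frac{\varepsilon^2}{7500 t^2}$ and $\delta_1 \stackrel{\mathrm{def}}{=} \frac{\varepsilon}{3000 t}$. From Yao's principle, Lemma 2.12, it suffices to prove
that for any distribution $\mu$ on $\mathcal{X} \times \mathcal{Y}$,

\[
\mathrm{D}^{(t),\mu^k}_{1 - (1 - \varepsilon/2)^{\lfloor \delta k \rfloor}}\left(f^k\right) \geq \delta_1 k c.
\]

Let $XY \sim \mu^k$. Let $\mathcal{Q}$ be a $t$-message deterministic protocol between Alice, with input $X$, and
Bob, with input $Y$, that computes $f^k$, with Alice sending the first message and total
communication $\delta_1 k c$ bits. We assume $t$ is odd for the rest of the argument and Bob makes the final
output (the case when $t$ is even follows similarly). The following Claim 3.6 implies that the
success of $\mathcal{Q}$ is at most $(1 - \varepsilon/2)^{\lfloor \delta k \rfloor}$ and this shows the desired. □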



Claim 3.6. For each i E [k], define a binary random variable Ti E {0, 1}, which represents the 
success of Q (that is Bob's output being correct) on the i-th instance. That is, = 1 if the Q 



computes the i-th instance of / correctly, and Ti = otherwise. Let k' 
coordinates {ii, . . . ,ik'} such that for each 1 < r < fc' — 1, either 



dcf 



[(5fcJ. There exist k' 



or 



Pr 



Pr 



<(l-£/2)* 



1 



where T« '^^^^ [] T,^ 



Proof of Claim 3.6. For $s \in [t]$, denote the $s$-th message of $\mathcal{Q}$ by $M_s$. Define $M \stackrel{\mathrm{def}}{=} M_1\cdots M_t$. In the following we assume $1 \le r < k'$; the same arguments also work when $r = 0$, that is, for identifying the first coordinate, which we skip to avoid repetition. Suppose we have already identified $r$ coordinates $i_1, \ldots, i_r$ satisfying $\Pr[T_{i_1} = 1] \le 1 - \varepsilon/2$ and $\Pr[T_{i_{j+1}} = 1 \mid T^{(j)} = 1] \le 1 - \varepsilon/2$ for $1 \le j \le r-1$. If $\Pr[T^{(r)} = 1] \le (1-\varepsilon/2)^{k'}$, we are done. So from now on, assume $\Pr[T^{(r)} = 1] > (1-\varepsilon/2)^{k'} \ge 2^{-\delta k}$.

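Let $D$ be a random variable uniformly distributed in $\{0,1\}^k$ and independent of $XY$. Let $U_i = X_i$ if $D_i = 0$, and $U_i = Y_i$ if $D_i = 1$. For any random variable $L$, let us introduce the notation $L^1 \stackrel{\mathrm{def}}{=} (L \mid T^{(r)} = 1)$. For example, $X^1Y^1 \stackrel{\mathrm{def}}{=} (XY \mid T^{(r)} = 1)$. If $L = L_1\cdots L_k$, define $L_{-i} \stackrel{\mathrm{def}}{=} L_1\cdots L_{i-1}L_{i+1}\cdots L_k$ and $L_{<i} \stackrel{\mathrm{def}}{=} L_1\cdots L_{i-1}$. Random variable $L_{\le i}$ is defined analogously. Let $C \stackrel{\mathrm{def}}{=} \{i_1, \ldots, i_r\}$. Define $R_i \stackrel{\mathrm{def}}{=} D_{-i}U_{-i}X_{C\cup[i-1]}Y_{C\cup[i-1]}$ for $i \in [k]$. We denote an element from the range of $R_i$ by $r_i$.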
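To make the bookkeeping of $D$, $U$ and $R_i$ concrete, here is a minimal sketch under stated assumptions: inputs are plain Python lists, coordinates are 0-indexed (so the set $[i-1]$ becomes `range(i)`), and the helper names `sample_d`, `build_u` and `build_r` are hypothetical, introduced only for this illustration.

```python
import random

def sample_d(k, rng):
    """D is uniform in {0,1}^k and drawn independently of the inputs."""
    return [rng.randrange(2) for _ in range(k)]

def build_u(x, y, d):
    """U_i = X_i if D_i = 0, and U_i = Y_i if D_i = 1."""
    return [xi if di == 0 else yi for xi, yi, di in zip(x, y, d)]

def build_r(i, x, y, d, u, C):
    """R_i = D_{-i} U_{-i} X_{C u [i-1]} Y_{C u [i-1]} (0-indexed coordinates)."""
    k = len(x)
    others = [l for l in range(k) if l != i]
    revealed = sorted(set(C) | set(range(i)))
    return (tuple(d[l] for l in others), tuple(u[l] for l in others),
            tuple(x[l] for l in revealed), tuple(y[l] for l in revealed))

# Small deterministic example with k = 4 and C = {3}.
x, y, d = [1, 2, 3, 4], [5, 6, 7, 8], [0, 1, 1, 0]
u = build_u(x, y, d)
r1 = build_r(1, x, y, d, u, C={3})
print(u, r1)
```

Note that $R_i$ never contains $D_i$ or $U_i$; this is what allows the later steps to average over $D_i$ being 0 or 1 with probability half each.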

To prove the claim, we will show that there exists a coordinate $j \notin C$ such that,

1. $(X_jY_j)$ can be embedded well in $(X_j^1R_j^1; Y_j^1R_j^1)$.

2. Random variables $X_j^1, Y_j^1, R_j^1, M_1^1, \ldots, M_t^1$ satisfy the conditions of Lemma 3.3 with appropriate parameters.

The following is helpful in meeting the first condition.

$$
\begin{aligned}
\delta k &\ge S_\infty\left(X^1Y^1 \,\|\, XY\right) && (20)\\
&\ge S\left(X^1Y^1 \,\|\, XY\right) \ge \sum_{i\notin C} S\left(X_i^1Y_i^1 \,\|\, X_iY_i\right), && (21)
\end{aligned}
$$

where Eq. (20) follows from the assumption that $\Pr[T^{(r)} = 1] > 2^{-\delta k}$ and Eq. (21) is from Fact 2.3. Also consider,



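$$
\begin{aligned}
\delta k &\ge S_\infty\left(X^1Y^1D^1U^1 \,\|\, XYDU\right)\\
&\ge S\left(X^1Y^1D^1U^1 \,\|\, XYDU\right)\\
&\ge \mathbb{E}_{(d,u,x_C,y_C)\leftarrow D^1,U^1,X_C^1,Y_C^1}\left[S\left((X^1Y^1)_{d,u,x_C,y_C} \,\|\, (XY)_{d,u,x_C,y_C}\right)\right] && (22)\\
&= \sum_{i\notin C}\ \mathbb{E}_{(d,u,x_{C\cup[i-1]},y_{C\cup[i-1]})\leftarrow D^1,U^1,X^1_{C\cup[i-1]},Y^1_{C\cup[i-1]}}\left[S\left((X_i^1Y_i^1)_{d,u,x_{C\cup[i-1]},y_{C\cup[i-1]}} \,\|\, (X_iY_i)_{d,u,x_{C\cup[i-1]},y_{C\cup[i-1]}}\right)\right] && (23)\\
&= \sum_{i\notin C}\ \mathbb{E}_{(d_i,u_i,r_i)\leftarrow D_i^1,U_i^1,R_i^1}\left[S\left((X_i^1Y_i^1)_{d_i,u_i,r_i} \,\|\, (X_iY_i)_{d_i,u_i,r_i}\right)\right] && (24)\\
&= \frac{1}{2}\sum_{i\notin C}\ \mathbb{E}_{(r_i,x_i)\leftarrow R_i^1,X_i^1}\left[S\left((Y_i^1)_{r_i,x_i} \,\|\, (Y_i)_{x_i}\right)\right] + \frac{1}{2}\sum_{i\notin C}\ \mathbb{E}_{(r_i,y_i)\leftarrow R_i^1,Y_i^1}\left[S\left((X_i^1)_{r_i,y_i} \,\|\, (X_i)_{y_i}\right)\right]. && (25)
\end{aligned}
$$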



Above, Eq. (22) and Eq. (23) follow from Fact 2.3; Eq. (24) is from the definition of $R_i$. Eq. (25) follows since $D_i^1$ is independent of $R_i^1$ and with probability half $D_i^1$ is 0, in which case $U_i^1 = X_i^1$, and with probability half $D_i^1$ is 1, in which case $U_i^1 = Y_i^1$.
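The two facts driving the first lines of these chains (conditioning on an event of probability $p$ costs relative min-entropy $\log_2(1/p)$, and relative entropy is at most relative min-entropy and superadditive over the coordinates of a product distribution) can be checked on a toy example. This is a minimal sketch with a hypothetical two-bit uniform distribution and event; the function names are illustrative only.

```python
from math import log2

def rel_entropy(p, q):
    """S(P||Q) = sum_x P(x) log2(P(x)/Q(x)), for P supported inside Q."""
    return sum(px * log2(px / q[x]) for x, px in p.items() if px > 0)

def rel_min_entropy(p, q):
    """S_inf(P||Q) = max_x log2(P(x)/Q(x)) over the support of P."""
    return max(log2(px / q[x]) for x, px in p.items() if px > 0)

def condition_on(p, event):
    """Return P conditioned on the event, together with the event's mass."""
    mass = sum(px for x, px in p.items() if x in event)
    return {x: px / mass for x, px in p.items() if x in event}, mass

def marginal(p, idx):
    """Marginal distribution of coordinate idx of a distribution on tuples."""
    out = {}
    for x, px in p.items():
        out[x[idx]] = out.get(x[idx], 0.0) + px
    return out

# Toy product distribution on two bits and a hypothetical "success" event.
p = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
event = {(0, 0), (0, 1), (1, 1)}
p1, mass = condition_on(p, event)

s_inf = rel_min_entropy(p1, p)      # equals log2(1/mass)
s_joint = rel_entropy(p1, p)        # at most s_inf
s_sum = sum(rel_entropy(marginal(p1, i), marginal(p, i)) for i in (0, 1))
print(s_inf, s_joint, s_sum)
```

In the proof the event is $T^{(r)} = 1$, whose probability is assumed to exceed $2^{-\delta k}$, which is where the bound $\delta k$ at the top of each chain comes from.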

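The following calculations are helpful in meeting the second condition.

$$
\begin{aligned}
\delta_1 k c &\ge \left|M^1\right| \ge I\left(X^1Y^1 ; M^1 \,\middle|\, D^1U^1X_C^1Y_C^1\right)\\
&= \sum_{i\notin C} I\left(X_i^1Y_i^1 ; M^1 \,\middle|\, D^1U^1X^1_{C\cup[i-1]}Y^1_{C\cup[i-1]}\right)\\
&= \sum_{i\notin C}\sum_{s=1}^{t} I\left(X_i^1Y_i^1 ; M_s^1 \,\middle|\, D^1U^1X^1_{C\cup[i-1]}Y^1_{C\cup[i-1]}M^1_{<s}\right)\\
&= \sum_{i\notin C}\left(\sum_{s\ \mathrm{odd}} I\left(X_i^1Y_i^1 ; M_s^1 \,\middle|\, D_i^1U_i^1R_i^1M^1_{<s}\right) + \sum_{s\ \mathrm{even}} I\left(X_i^1Y_i^1 ; M_s^1 \,\middle|\, D_i^1U_i^1R_i^1M^1_{<s}\right)\right)\\
&\ge \frac{1}{2}\sum_{i\notin C}\left(\sum_{s\ \mathrm{odd}} I\left(X_i^1 ; M_s^1 \,\middle|\, R_i^1Y_i^1M^1_{<s}\right) + \sum_{s\ \mathrm{even}} I\left(Y_i^1 ; M_s^1 \,\middle|\, R_i^1X_i^1M^1_{<s}\right)\right). && (26)
\end{aligned}
$$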



Above we have used the chain rule for mutual information several times. The last inequality follows since $D_i^1$ is independent of $(X_i^1Y_i^1R_i^1M^1)$ and with probability half $D_i^1$ is 0, in which case $U_i^1 = X_i^1$, and with probability half $D_i^1$ is 1, in which case $U_i^1 = Y_i^1$.
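The averaging over $D_i$ used in the last inequality can be verified on a toy joint distribution. The sketch below is illustrative only: it drops the conditioning on $R_i$ and $M_{<s}$, uses a single coordinate with hypothetical probabilities, and checks the identity $I(XY;M \mid DU) = \frac12 I(Y;M \mid X) + \frac12 I(X;M \mid Y)$, from which the factor-half lower bound follows by dropping one nonnegative term.

```python
from collections import defaultdict
from math import log2

def marginal(joint, idxs):
    """Marginal of the coordinates listed in idxs (a tuple of positions)."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idxs)] += p
    return out

def cond_mutual_info(joint, a, b, c):
    """I(A;B|C) in bits; a, b, c are tuples of coordinate positions."""
    p_abc = marginal(joint, a + b + c)
    p_ac, p_bc, p_c = marginal(joint, a + c), marginal(joint, b + c), marginal(joint, c)
    total = 0.0
    for outcome, p in p_abc.items():
        va = outcome[:len(a)]
        vb = outcome[len(a):len(a) + len(b)]
        vc = outcome[len(a) + len(b):]
        total += p * log2(p * p_c[vc] / (p_ac[va + vc] * p_bc[vb + vc]))
    return total

# Hypothetical toy distribution: correlated bits (X, Y), a message bit M that
# depends on both, D uniform and independent, U = X if D = 0 else Y.
p_xy = {(x, y): 0.4 if x == y else 0.1 for x in (0, 1) for y in (0, 1)}
p_m1 = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.6, (1, 1): 0.9}
joint = {}
for (x, y), pxy in p_xy.items():
    for m in (0, 1):
        pm = p_m1[(x, y)] if m == 1 else 1 - p_m1[(x, y)]
        for d in (0, 1):
            u = x if d == 0 else y
            joint[(x, y, m, d, u)] = pxy * pm * 0.5

# Coordinate positions: X=0, Y=1, M=2, D=3, U=4.
lhs = cond_mutual_info(joint, (0, 1), (2,), (3, 4))
i_y_given_x = cond_mutual_info(joint, (1,), (2,), (0,))
i_x_given_y = cond_mutual_info(joint, (0,), (2,), (1,))
rhs = 0.5 * i_y_given_x + 0.5 * i_x_given_y
print(lhs, rhs)
```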
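For the following, let $s \in [t]$ be odd.

$$
\begin{aligned}
\delta k &\ge S_\infty\left(D^1U^1X^1Y^1M^1_{\le s} \,\|\, DUXYM_{\le s}\right)\\
&\ge S\left(D^1U^1X^1Y^1M^1_{\le s} \,\|\, DUXYM_{\le s}\right)\\
&\ge \mathbb{E}_{(d,u,x_C,y_C,m_{\le s})\leftarrow D^1,U^1,X_C^1,Y_C^1,M^1_{\le s}}\left[S\left((X^1Y^1)_{d,u,x_C,y_C,m_{\le s}} \,\|\, (XY)_{d,u,x_C,y_C,m_{\le s}}\right)\right]\\
&= \sum_{i\notin C}\ \mathbb{E}_{(d,u,x_{C\cup[i-1]},y_{C\cup[i-1]},m_{\le s})\leftarrow D^1,U^1,X^1_{C\cup[i-1]},Y^1_{C\cup[i-1]},M^1_{\le s}}\left[S\left((X_i^1Y_i^1)_{d,u,x_{C\cup[i-1]},y_{C\cup[i-1]},m_{\le s}} \,\|\, (X_iY_i)_{d,u,x_{C\cup[i-1]},y_{C\cup[i-1]},m_{\le s}}\right)\right]\\
&= \sum_{i\notin C}\ \mathbb{E}_{(d_i,u_i,r_i,m_{\le s})\leftarrow D_i^1,U_i^1,R_i^1,M^1_{\le s}}\left[S\left((X_i^1Y_i^1)_{d_i,u_i,r_i,m_{\le s}} \,\|\, (X_iY_i)_{d_i,u_i,r_i,m_{\le s}}\right)\right] && (27)\\
&\ge \frac{1}{2}\sum_{i\notin C}\ \mathbb{E}_{(x_i,r_i,m_{\le s})\leftarrow X_i^1,R_i^1,M^1_{\le s}}\left[S\left((Y_i^1)_{x_i,r_i,m_{\le s}} \,\|\, (Y_i)_{x_i,r_i,m_{\le s}}\right)\right]\\
&= \frac{1}{2}\sum_{i\notin C}\ \mathbb{E}_{(x_i,r_i,m_{\le s})\leftarrow X_i^1,R_i^1,M^1_{\le s}}\left[S\left((Y_i^1)_{x_i,r_i,m_{\le s}} \,\|\, (Y_i)_{x_i,r_i,m_{<s}}\right)\right] && (28)\\
&\ge \frac{1}{2}\sum_{i\notin C}\ \mathbb{E}_{(x_i,r_i,m_{\le s})\leftarrow X_i^1,R_i^1,M^1_{\le s}}\left[S\left((Y_i^1)_{x_i,r_i,m_{\le s}} \,\|\, (Y_i^1)_{x_i,r_i,m_{<s}}\right)\right] && (29)\\
&= \frac{1}{2}\sum_{i\notin C} I\left(Y_i^1 ; M_s^1 \,\middle|\, X_i^1R_i^1M^1_{<s}\right). && (30)
\end{aligned}
$$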


Above we have used Fact 2.3 several times. Eq. (27) follows from the definition of $R_i$; Eq. (28) follows from the fact that $Y_i \leftrightarrow X_iR_iM_{<s} \leftrightarrow M_s$ for any $i$, whenever $s$ is odd; Eq. (29) follows from Fact 2.4.

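From a symmetric argument, we can show that when $s \in [t]$ is even,

$$\delta k \ge \frac{1}{2}\sum_{i\notin C} I\left(X_i^1 ; M_s^1 \,\middle|\, Y_i^1R_i^1M^1_{<s}\right). \qquad (31)$$

Eq. (30) and Eq. (31) together imply

$$\sum_{i\notin C}\left(\sum_{s\ \mathrm{odd}} I\left(Y_i^1 ; M_s^1 \,\middle|\, R_i^1X_i^1M^1_{<s}\right) + \sum_{s\ \mathrm{even}} I\left(X_i^1 ; M_s^1 \,\middle|\, R_i^1Y_i^1M^1_{<s}\right)\right) \le 2\delta k t. \qquad (32)$$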



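Combining Equations (21), (25), (26), (32), and making standard use of Markov's inequality, we can get a coordinate $j \notin C$ such that

$$
\begin{aligned}
S\left(X_j^1Y_j^1 \,\|\, X_jY_j\right) &\le 12\delta, && (33)\\
\mathbb{E}_{(r_j,x_j)\leftarrow R_j^1,X_j^1}\left[S\left((Y_j^1)_{r_j,x_j} \,\|\, (Y_j)_{x_j}\right)\right] &\le 12\delta, && (34)\\
\mathbb{E}_{(r_j,y_j)\leftarrow R_j^1,Y_j^1}\left[S\left((X_j^1)_{r_j,y_j} \,\|\, (X_j)_{y_j}\right)\right] &\le 12\delta, && (35)\\
\sum_{s\ \mathrm{odd}} I\left(X_j^1 ; M_s^1 \,\middle|\, R_j^1Y_j^1M^1_{<s}\right) + \sum_{s\ \mathrm{even}} I\left(Y_j^1 ; M_s^1 \,\middle|\, R_j^1X_j^1M^1_{<s}\right) &\le 12\delta_1 c, && (36)\\
\sum_{s\ \mathrm{odd}} I\left(Y_j^1 ; M_s^1 \,\middle|\, R_j^1X_j^1M^1_{<s}\right) + \sum_{s\ \mathrm{even}} I\left(X_j^1 ; M_s^1 \,\middle|\, R_j^1Y_j^1M^1_{<s}\right) &\le 14\delta t. && (37)
\end{aligned}
$$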
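The "standard use of Markov's inequality" invoked here is the following averaging argument: if several nonnegative quantities are small on average over the coordinates, and the ratios of their means to the chosen thresholds sum to less than 1, then a union bound leaves at least one coordinate on which all of them are below threshold simultaneously. A minimal sketch follows; the sequences and thresholds are hypothetical and unrelated to the constants of the proof.

```python
def markov_guarantees_existence(quantities, thresholds):
    """Union-bound certificate: by Markov, the fraction of indices where the
    l-th sequence exceeds thresholds[l] is at most mean_l / thresholds[l];
    if these fractions sum to less than 1, a good index must exist."""
    n = len(quantities[0])
    return sum(sum(q) / (n * t) for q, t in zip(quantities, thresholds)) < 1

def find_good_coordinate(quantities, thresholds):
    """Return the first index j with quantities[l][j] <= thresholds[l] for all l."""
    n = len(quantities[0])
    for j in range(n):
        if all(q[j] <= t for q, t in zip(quantities, thresholds)):
            return j
    return None

# Two hypothetical nonnegative sequences over four coordinates.
a = [0.9, 0.0, 0.1, 0.0]
b = [0.0, 0.8, 0.1, 0.1]
thresholds = [0.75, 0.75]

certified = markov_guarantees_existence([a, b], thresholds)
j = find_good_coordinate([a, b], thresholds)
print(certified, j)
```

In the proof the sequences are the per-coordinate terms of Equations (21), (25), (26) and (32), indexed by the coordinates outside $C$.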
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Set e' =^ and 



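$$
c_s \stackrel{\mathrm{def}}{=}
\begin{cases}
I\left(Y_j^1 ; M_s^1 \,\middle|\, R_j^1X_j^1M^1_{<s}\right) & s \in [t] \text{ odd},\\
I\left(X_j^1 ; M_s^1 \,\middle|\, R_j^1Y_j^1M^1_{<s}\right) & s \in [t] \text{ even},
\end{cases}
\qquad
d_s \stackrel{\mathrm{def}}{=}
\begin{cases}
I\left(Y_j^1 ; M_s^1 \,\middle|\, R_j^1X_j^1M^1_{<s}\right) & s \in [t] \text{ even},\\
I\left(X_j^1 ; M_s^1 \,\middle|\, R_j^1Y_j^1M^1_{<s}\right) & s \in [t] \text{ odd}.
\end{cases}
$$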



By (37), $\sum_{s=1}^{t}\sqrt{c_s} \le \sqrt{14\delta}\,t$. From Equations (33), (34), (35) and Lemma 2.11 we can infer that $(X_jY_j)$ is $(1 - 10\sqrt{3\delta})$-embeddable in $(X_j^1R_j^1; Y_j^1R_j^1)$. This, combined with Equations (36), (37) and Lemma 3.5 (take $\varepsilon', c_s, d_s$ in the lemma to be as defined above and take $XYX'Y'R'M'_1\cdots M'_t$ in the lemma to be $X_jY_jX_j^1Y_j^1R_j^1M_1^1\cdots M_t^1$), implies the following (for an appropriate constant $\kappa$). There exists a public-coin $t$-message protocol $\mathcal{Q}^1$ between Alice, with input $X_j$, and Bob, with input $Y_j$, with Alice sending the first message and total communication

$$\cdots + O\left(t\log\frac{1}{\varepsilon'}\right) < D^{(t),\mu}_{\varepsilon}(f),$$

such that at the end Alice possesses $R_AM_1\cdots M_t$ and Bob possesses $R_BM_1\cdots M_t$, satisfying

$$\left\|X_jY_jR_AR_BM_1\cdots M_t - X_j^1Y_j^1R_j^1R_j^1M_1^1\cdots M_t^1\right\|_1 \le 10\sqrt{3\delta} + 3\sqrt{14\delta}\,t + 6\varepsilon' t < \varepsilon/2.$$

Assume for contradiction that $\Pr[T_j = 1 \mid T^{(r)} = 1] > 1 - \varepsilon/2$. Consider a protocol $\mathcal{Q}^2$ (with no communication) for $f$ between Alice, with input $X_j^1R_j^1M_1^1\cdots M_t^1$, and Bob, with input $Y_j^1R_j^1M_1^1\cdots M_t^1$, as follows. Bob generates the rest of the random variables present in $Y^1$ (not present in his input) himself since, conditioned on his input, those other random variables are independent of Alice's input (here we use Fact 2.13). Bob then generates the output for the $j$-th coordinate in $\mathcal{Q}$, and makes it the output of $\mathcal{Q}^2$. This ensures that the success probability of Bob in $\mathcal{Q}^2$ is $\Pr[T_j = 1 \mid T^{(r)} = 1] > 1 - \varepsilon/2$. Now consider protocol $\mathcal{Q}^3$ for $f$, with Alice's input $X_j$ and Bob's input $Y_j$, which is a composition of $\mathcal{Q}^1$ followed by $\mathcal{Q}^2$. This ensures, using Fact 2.8, that the success probability of Bob (averaged over public coins and the inputs $X_jY_j$) in $\mathcal{Q}^3$ is larger than $1 - \varepsilon$. Finally, by fixing the public coins of $\mathcal{Q}^3$, we get a deterministic protocol $\mathcal{Q}^4$ for $f$ with Alice's input $X_j$ and Bob's input $Y_j$ such that the communication of $\mathcal{Q}^4$ is less than $D^{(t),\mu}_{\varepsilon}(f)$ and Bob's success probability (averaged over the inputs $X_jY_j$) in $\mathcal{Q}^4$ is larger than $1 - \varepsilon$. This is a contradiction to the definition of $D^{(t),\mu}_{\varepsilon}(f)$ (recall that $X_jY_j$ are distributed according to $\mu$). Hence it must be that $\Pr[T_j = 1 \mid T^{(r)} = 1] \le 1 - \varepsilon/2$. The claim now follows by setting $i_{r+1} = j$. □



Open problems 

Some natural questions that arise from this work are: 

1. Can the dependence on t in our direct product theorem be improved? 

2. Can these techniques be extended to show direct product theorems for bounded-round 
quantum communication complexity? 